library(knitr)
opts_knit$set(root.dir = "/Users/Morgan/Documents/Research/McMaster/Modeling_Work/Mike_Lunchbox/My_edits/lunchbox")
options(mc.cores = parallel::detectCores())
#opts_chunk$set(cache = TRUE)

NIMBLE: Numerical Inference for statistical Models for Bayesian and Likelihood Estimation

NIMBLE is built in R but compiles your models and algorithms to C++ for speed.
NIMBLE is most commonly used for MCMC but can also be used to implement a variety of other algorithms (e.g. particle filtering, MCEM).
It provides:
1. A system for writing statistical models flexibly, which is an extension of the BUGS language.
2. A library of algorithms such as MCMC.
3. A language, also called NIMBLE, embedded within and similar in style to R, for writing algorithms that operate on BUGS models.

One of the most important concepts behind NIMBLE is to allow a combination of high-level processing in R and low-level processing in compiled C++.

Why NIMBLE?
  1. Options (a more customizable MCMC, the ability to run JAGS and STAN models, EM, particle filtering) that lead to a more adaptable workflow.
  2. User-defined functions and distributions – written as nimbleFunctions – can be used in model code.
  3. Multiple parameterizations for distributions, similar to those in R, can be used.
    e.g. normal distribution with BUGS parameter order:
    x ~ dnorm(a + b * c, tau)
    normal distribution with a named parameter:
    y ~ dnorm(a + b * c, sd = sigma)
  4. Named parameters for distributions and functions, similar to R function calls, can be used.
  5. More flexible indexing of vector nodes within larger variables is allowed. For example one can place a multivariate normal vector arbitrarily within a higher-dimensional object, not just in the last index.
  6. More general constraints can be declared using dconstraint, which extends the concept of JAGS’ dinterval.
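As a hedged sketch of point 2, following the user-defined-distribution pattern in the NIMBLE manual (the names dmyexp/rmyexp are illustrative), a new distribution is just a pair of nimbleFunctions:

```r
library(nimble)

## sketch of a user-defined distribution (an exponential), written as
## a density ('d') and simulation ('r') pair of nimbleFunctions
dmyexp <- nimbleFunction(
    run = function(x = double(0), rate = double(0),
                   log = integer(0, default = 0)) {
        returnType(double(0))
        logProb <- log(rate) - rate * x
        if (log) return(logProb) else return(exp(logProb))
    })

rmyexp <- nimbleFunction(
    run = function(n = integer(0), rate = double(0)) {
        returnType(double(0))
        return(rexp(1, rate))
    })

## after registering the pair with registerDistributions(), model code
## can contain declarations such as:  y ~ dmyexp(rate = lambda)
```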

Downloading, installing and loading NIMBLE

On Windows, you should download and install Rtools.exe, available from http://cran.r-project.org/bin/windows/Rtools/.
On OS X, you should install Xcode.

After these are installed you can install NIMBLE in R using
install.packages("nimble", repos = "http://r-nimble.org", type = "source")

Please post about installation problems to the nimble-users Google group or email nimble.stats@gmail.com.

You will also need to install STAN's R interface, rstan, using the following commands:
Sys.setenv(MAKEFLAGS = "-j4")
install.packages("rstan", dependencies = TRUE)

In total you will need the following packages:

library("nimble")
library("R2jags")
library("ggplot2")
library("rstan")
library("igraph")
library("parallel")
library("mcmcplots")
library("lattice")
library("coda")

Things to know about working with NIMBLE

Programming in NIMBLE involves a fundamental distinction between:
1. the steps for an algorithm that need to happen only once, at the beginning, such as inspecting the model
2. the steps that need to happen each time a function is called, such as MCMC iterations.
When one writes a nimbleFunction, each of these parts can be provided separately.
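A minimal sketch of that two-part structure (the function name and the usage line are our own illustration, not part of the model below):

```r
library(nimble)

## setup code runs ONCE, in R, when the nimbleFunction is specialized;
## run code executes on EVERY call and can be compiled to C++
calcDeps <- nimbleFunction(
    setup = function(model, target) {
        ## one-time step: inspect the model graph
        calcNodes <- model$getDependencies(target)
    },
    run = function() {
        ## repeated step: sum the log probabilities of those nodes
        returnType(double(0))
        return(calculate(model, calcNodes))
    })

## hypothetical usage, given an existing model object such as CBout:
## f <- calcDeps(CBout, 'beta'); f$run()
```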

NIMBLE calls non-stochastic nodes “deterministic”, whereas BUGS calls them “logical”. NIMBLE uses “logical” in the way R does, to refer to boolean (TRUE/FALSE) variables.
Alternative models can be defined from the same model code by using if-then-else statements that are evaluated when the model is defined.
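For example (a hedged sketch; usePriorOnReporting is an illustrative R variable that must exist when nimbleModel() processes the code):

```r
library(nimble)

## the 'if' below is resolved once, at model-definition time, so the
## same code block yields two different models depending on an R flag
usePriorOnReporting <- TRUE
nimcodeAlt <- nimbleCode({
    beta ~ dunif(0, 1)
    if (usePriorOnReporting) {
        reporting ~ dbeta(2, 2)
    } else {
        reporting <- 0.5
    }
})
```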

When NIMBLE processes model code:
  1. NIMBLE extracts all the declarations in the BUGS code to create a model definition.
  2. From the model definition, NIMBLE builds a working model in R. This can be used to manipulate variables and operate the model from R. Operating the model includes calculating, simulating, or querying the log probability value of model nodes.
  3. From the working model, NIMBLE generates customized C++ code representing the model, compiles the C++, loads it back into R, and provides an R object that interfaces to it. We often call the uncompiled model the “R-model” and the compiled model the “C-model.”

Presentation Outline

The general outline for this presentation follows the NIMBLE User Manual:
http://r-nimble.org/documentation-2
However, the model(s) used here were written by us.

Part 1

1.1 Build a chain binomial model in JAGS. Conduct parameter estimation
1.2 Translate the model into NIMBLE. Conduct parameter estimation
1.2.1 Model exploration/conversion
1.2.2 Create a basic MCMC specification for the chain binomial, compile and run the MCMC
1.2.3 Small MCMC specification adjustments (more on this in Part 3)
1.3 Compare the JAGS and NIMBLE results (parameter estimates, uncertainty, convergence, efficiency)

Part 2

2.1 Translate the model using a “hybrid approach” (STAN does not allow for discrete latent variables)
2.1.1 Conduct parameter estimation using JAGS and NIMBLE
2.1.2 Run the hybrid model in STAN and compare the results from JAGS, NIMBLE and STAN
2.2 Compare the NIMBLE Chain Binomial and STAN hybrid model

Part 3

3.1 Explore more fine-tuned adjustments that can be made in NIMBLE
3.1.1 NIMBLE functions (e.g. allows for the implementation of custom samplers)

Part 4

4.1 NIMBLE extras:
4.1.1 Create, compile and run a Monte Carlo Expectation Maximization (MCEM) algorithm, which illustrates some of the flexibility NIMBLE provides to combine R and NIMBLE.
4.1.2 Implement particle filtering for the chain binomial

Part 5

5.1 Misc NIMBLE notes (truncated distributions, lifted nodes, logProb)

Part 1

1.1 Build a chain binomial model in JAGS

The first step is to construct the simulator from which we will obtain our data.

Note: It will be important to set your current working directory to “../stat744/notes/NIMBLE”

Set parameters and load the Chain Binomial simulator

beta <- 0.02
pop <- 100
effpropS <- 0.8
effpropI <- 0.2
reporting <- 0.5

s0 <- effpropS*pop
r0 <- 0
zerohack <- 0.001
numobs <- 12
nimtimevec <- c()
source("CBsimulator.R")
sim <- simCB(beta = beta, pop = pop, effpropS = effpropS, effpropI = effpropI, 
             t0 = 1, numobs = numobs, reporting = reporting, seed = 3)
sim
##    time  S  I  R Iobs
## 1     1 80  5  0    1
## 2     2 70 10  5    5
## 3     3 59 11 15    6
## 4     4 47 12 26    4
## 5     5 38  9 38    5
## 6     6 31  7 47    4
## 7     7 27  4 54    2
## 8     8 25  2 58    2
## 9     9 23  2 60    0
## 10   10 22  1 62    1
## 11   11 22  0 63    0
## 12   12 22  0 63    0

Take a peek at what this model produces

Set up the required arguments to run the JAGS model

data <- list(obs = sim$Iobs, pop = pop, numobs = nrow(sim), r0 = r0)

inits <- list(list(I = sim$I * 1, effpropS = effpropS - 0.1, effpropI = effpropI - 
    0.15, beta = beta + 0.05, reporting = reporting), list(I = sim$I * 1 + 1, 
    effpropS = effpropS - 0.1, effpropI = effpropI + 0.2, beta = beta + 0.1, 
    reporting = reporting), list(I = sim$I * 1 + 2, effpropS = effpropS, effpropI = effpropI, 
    beta = beta - 0.008, reporting = reporting))

params = c("beta", "effpropS", "effpropI", "reporting")

# rjags::set.factory('bugs::Conjugate', FALSE, type='sampler')

Create the model and examine the MCMC algorithms that JAGS will use to sample

cbjagsmodel <- jags.model(data = data,
               inits = inits,
               file = "CB.bug",
               n.chains = length(inits))
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
##    Graph Size: 84
## 
## Initializing model
list.samplers(cbjagsmodel)
## $ShiftedCount
## [1] "I[12]"
## 
## $DiscreteSlicer
## [1] "I[11]"
## 
## $DiscreteSlicer
## [1] "I[10]"
## 
## $DiscreteSlicer
## [1] "I[9]"
## 
## $DiscreteSlicer
## [1] "I[8]"
## 
## $DiscreteSlicer
## [1] "I[7]"
## 
## $DiscreteSlicer
## [1] "I[6]"
## 
## $DiscreteSlicer
## [1] "I[5]"
## 
## $DiscreteSlicer
## [1] "I[4]"
## 
## $DiscreteSlicer
## [1] "I[3]"
## 
## $DiscreteSlicer
## [1] "I[2]"
## 
## $DiscreteSlicer
## [1] "I[1]"
## 
## $DiscreteSlicer
## [1] "S[1]"
## 
## $RealSlicer
## [1] "beta"
## 
## $ConjugateBeta
## [1] "effpropI"
## 
## $ConjugateBeta
## [1] "effpropS"
## 
## $ConjugateBeta
## [1] "reporting"

Run some chains (we could use rjags::coda.samples on cbjagsmodel, but here we will just run jags())

jagstime <- system.time(cbjags <- jags(data = data,
               inits = inits,
               param = params,
               model.file = "CB.bug",
               n.iter = 11000,
               n.burnin = 1000,
               n.thin = 20,
               n.chains = length(inits)))
## module glm loaded
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
##    Graph Size: 84
## 
## Initializing model
cbjags
## Inference for Bugs model at "CB.bug", fit using jags,
##  3 chains, each with 11000 iterations (first 1000 discarded), n.thin = 20
##  n.sims = 1500 iterations saved
##           mu.vect sd.vect   2.5%    25%    50%    75%  97.5%  Rhat n.eff
## beta        0.022   0.007  0.012  0.017  0.020  0.025  0.039 1.002  1200
## effpropI    0.391   0.270  0.035  0.155  0.331  0.589  0.943 1.003   730
## effpropS    0.820   0.128  0.506  0.753  0.855  0.918  0.974 1.006   430
## reporting   0.481   0.119  0.298  0.391  0.464  0.553  0.762 1.005   370
## deviance   29.965   3.350 23.533 27.913 29.820 32.034 37.025 1.008   290
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 5.6 and DIC = 35.5
## DIC is an estimate of expected predictive error (lower deviance is better).
xyplot(as.mcmc(cbjags))

1.2 Build the NIMBLE model
source('nimCB.R')

Set up the model. Here we need: Constants, Data, Initial Values, NIMBLE model object

nimCBdata <- list(obs = sim$Iobs)

nimCBcon <- list(numobs = numobs, pop = pop, r0 = r0)

nimCBinits <- list(I = sim$I,
                   effpropS = effpropS,
                   effpropI = effpropI,
                   beta = beta,
                   reporting = reporting,
                   s0 = s0)

nimtimevec[1] <- system.time(CBout <- nimbleModel(code = nimcode, 
                         name = 'CBout', 
                         constants = nimCBcon,
                         data = nimCBdata, 
                         inits = nimCBinits))[3]
## defining model...
## building model...
## setting data and initial values...
## checking model...   (use nimbleModel(..., check = FALSE) to skip model check)
## model building finished
1.2.1 Model exploration/conversion
CBout$getNodeNames()
##  [1] "reporting"            "effpropS"             "effpropI"            
##  [4] "beta"                 "R[1]"                 "s0"                  
##  [7] "lifted_d100_minus_s0" "S[1]"                 "I[1]"                
## [10] "pSI[1]"               "obs[1]"               "R[2]"                
## [13] "I[2]"                 "S[2]"                 "R[3]"                
## [16] "pSI[2]"               "obs[2]"               "I[3]"                
## [19] "S[3]"                 "R[4]"                 "pSI[3]"              
## [22] "obs[3]"               "I[4]"                 "S[4]"                
## [25] "R[5]"                 "pSI[4]"               "obs[4]"              
## [28] "I[5]"                 "S[5]"                 "R[6]"                
## [31] "pSI[5]"               "obs[5]"               "I[6]"                
## [34] "S[6]"                 "R[7]"                 "pSI[6]"              
## [37] "obs[6]"               "I[7]"                 "S[7]"                
## [40] "R[8]"                 "pSI[7]"               "obs[7]"              
## [43] "I[8]"                 "S[8]"                 "R[9]"                
## [46] "pSI[8]"               "obs[8]"               "I[9]"                
## [49] "S[9]"                 "R[10]"                "pSI[9]"              
## [52] "obs[9]"               "I[10]"                "S[10]"               
## [55] "R[11]"                "pSI[10]"              "obs[10]"             
## [58] "I[11]"                "S[11]"                "R[12]"               
## [61] "pSI[11]"              "obs[11]"              "I[12]"               
## [64] "S[12]"                "pSI[12]"              "obs[12]"
CBout$obs
##  [1] 1 5 6 4 5 4 2 2 0 1 0 0
par(mfrow = c(1,1))
plot(CBout$graph)

nimbleModel does its best to initialize a model, but let’s say you want to re-initialize I.

simulate(CBout, 'I')
CBout$I
##  [1]  3  6 15 16 10  6  3  1  2  0  0  0
CBout$getDependencies(c("beta"))
##  [1] "beta"    "pSI[1]"  "I[2]"    "pSI[2]"  "I[3]"    "pSI[3]"  "I[4]"   
##  [8] "pSI[4]"  "I[5]"    "pSI[5]"  "I[6]"    "pSI[6]"  "I[7]"    "pSI[7]" 
## [15] "I[8]"    "pSI[8]"  "I[9]"    "pSI[9]"  "I[10]"   "pSI[10]" "I[11]"  
## [22] "pSI[11]" "I[12]"   "pSI[12]"
CBout$getDependencies(c("beta"), determOnly = TRUE)
##  [1] "pSI[1]"  "pSI[2]"  "pSI[3]"  "pSI[4]"  "pSI[5]"  "pSI[6]"  "pSI[7]" 
##  [8] "pSI[8]"  "pSI[9]"  "pSI[10]" "pSI[11]" "pSI[12]"
1.2.2 Compile the C++ model, build the MCMC and Run
nimtimevec[2] <- system.time(CBoutC <- compileNimble(CBout))[3]

Configure the MCMC with the default options (we will return to customizing this setup later)

nimtimevec[3] <- system.time(CBoutSpec <- configureMCMC(CBout, print = TRUE))[3]
## [1] RW sampler: reporting,  adaptive: TRUE,  adaptInterval: 200,  scale: 1
## [2] RW sampler: effpropS,  adaptive: TRUE,  adaptInterval: 200,  scale: 1
## [3] RW sampler: effpropI,  adaptive: TRUE,  adaptInterval: 200,  scale: 1
## [4] RW sampler: beta,  adaptive: TRUE,  adaptInterval: 200,  scale: 1
## [5] slice sampler: s0,  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [6] slice sampler: I[1],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [7] slice sampler: I[2],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [8] slice sampler: I[3],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [9] slice sampler: I[4],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [10] slice sampler: I[5],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [11] slice sampler: I[6],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [12] slice sampler: I[7],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [13] slice sampler: I[8],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [14] slice sampler: I[9],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [15] slice sampler: I[10],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [16] slice sampler: I[11],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [17] slice sampler: I[12],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100

Add chain monitors for the parameters of interest and add thinning

CBoutSpec$addMonitors(c("beta", "effpropS", "effpropI", "reporting"))
## thin = 1: reporting, effpropS, effpropI, beta
CBoutSpec$setThin(20)
## thin = 20: reporting, effpropS, effpropI, beta

Build the MCMC

nimtimevec[4] <- system.time(CBoutMCMC <- buildMCMC(CBoutSpec))[3]

nimtimevec[5] <- system.time(CBoutMCMC <- compileNimble(CBoutMCMC, project = CBout, resetFunctions = TRUE))[3]
niter <- 11000

set.seed(0)
nimtimevec[6] <- system.time(CBoutMCMC$run(niter))[3]

Quick peek at time required. Below we will take a look at efficiency using a convenient coda function

Gross time required

jagstime[3]
## elapsed 
##  12.015
sum(nimtimevec[1:6], na.rm = TRUE)
## [1] 16.58
nimtimevec[6]
## [1] 0.94

Efficiency (Net time in a sense)

samples <- as.matrix(CBoutMCMC$mvSamples)

head(samples)
##      beta   effpropI  effpropS reporting
## [1,] 0.02 0.20731932 0.8427785 0.5000000
## [2,] 0.02 0.05087168 0.7887759 0.6916757
## [3,] 0.02 0.28765490 0.7887759 0.5316554
## [4,] 0.02 0.14800198 0.7111727 0.6957240
## [5,] 0.02 0.30758335 0.7386033 0.6028016
## [6,] 0.02 0.33371754 0.8991708 0.4605610
jags_eff <- effectiveSize(as.mcmc.list(as.mcmc(cbjags))) / jagstime[3]   # effective samples per second
nim_eff <- effectiveSize(as.mcmc.list(as.mcmc(samples))) / nimtimevec[6]

jags_eff
##      beta  deviance  effpropI  effpropS reporting 
##  439.3923  814.5146  529.4877  315.7712  492.6933
nim_eff
##      beta  effpropI  effpropS reporting 
##  80.84285 131.48813  71.82709 128.10869

Save these points for later

  def_effS <- samples[ , 'effpropS']
  def_effI <- samples[ , 'effpropI']

Look at the correlation in the chains

  acf(samples[, "beta"])

  acf(samples[, "reporting"])

  acf(samples[, "effpropS"])

  acf(samples[, "effpropI"])

1.2.3 Small MCMC specification adjustment

A few undesirable results here… we can add a block sampler to decrease correlation

Take a look at the samplers being used

CBoutSpec$getSamplers()
## [1]  RW sampler: reporting,  adaptive: TRUE,  adaptInterval: 200,  scale: 1
## [2]  RW sampler: effpropS,  adaptive: TRUE,  adaptInterval: 200,  scale: 1
## [3]  RW sampler: effpropI,  adaptive: TRUE,  adaptInterval: 200,  scale: 1
## [4]  RW sampler: beta,  adaptive: TRUE,  adaptInterval: 200,  scale: 1
## [5]  slice sampler: s0,  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [6]  slice sampler: I[1],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [7]  slice sampler: I[2],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [8]  slice sampler: I[3],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [9]  slice sampler: I[4],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [10] slice sampler: I[5],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [11] slice sampler: I[6],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [12] slice sampler: I[7],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [13] slice sampler: I[8],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [14] slice sampler: I[9],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [15] slice sampler: I[10],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [16] slice sampler: I[11],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [17] slice sampler: I[12],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
CBoutSpec$addSampler(target = c('effpropS', 'effpropI'), type = 'RW_block',
                      control = list(adaptInterval = 10000))
## [18] RW_block sampler: effpropS, effpropI,  adaptive: TRUE,  adaptScaleOnly: FALSE,  adaptInterval: 10000,  scale: 1,  propCov: identity
CBoutSpec$setThin(30)
## thin = 30: reporting, effpropS, effpropI, beta
CBoutMCMC <- buildMCMC(CBoutSpec)
CBoutMCMC <- compileNimble(CBoutMCMC, project  = CBout, resetFunctions = TRUE)
CBoutMCMC$run(30000)
## NULL
samplesNew <- as.matrix(CBoutMCMC$mvSamples)

Check for an improvement

  par(mfrow = c(2,2))
  acf(samplesNew[, "effpropS"])
  acf(samplesNew[, "effpropI"])
  plot(samplesNew[ , 'effpropS'], type = 'l', xlab = 'iteration')
  plot(samplesNew[ , 'effpropI'], type = 'l', xlab = 'iteration')

  par(mfrow = c(1,1))
  plot(samplesNew[ , 'effpropS'], samplesNew[ , 'effpropI'], pch = 20)
  points(def_effS, def_effI, pch = 20, col = "blue")

Well that didn’t do anything…

NIMBLE allows samplers (either NIMBLE-provided or user-created) to be specified parameter by parameter or node by node

CBout <- nimbleModel(code = nimcode, 
                         name = 'CBout', 
                         constants = nimCBcon,
                         data = nimCBdata, 
                         inits = nimCBinits)
## defining model...
## building model...
## setting data and initial values...
## checking model...   (use nimbleModel(..., check = FALSE) to skip model check)
## model building finished
CBoutC <- compileNimble(CBout)
  
CBoutSpec <- configureMCMC(CBout, print = TRUE)
## [1] RW sampler: reporting,  adaptive: TRUE,  adaptInterval: 200,  scale: 1
## [2] RW sampler: effpropS,  adaptive: TRUE,  adaptInterval: 200,  scale: 1
## [3] RW sampler: effpropI,  adaptive: TRUE,  adaptInterval: 200,  scale: 1
## [4] RW sampler: beta,  adaptive: TRUE,  adaptInterval: 200,  scale: 1
## [5] slice sampler: s0,  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [6] slice sampler: I[1],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [7] slice sampler: I[2],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [8] slice sampler: I[3],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [9] slice sampler: I[4],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [10] slice sampler: I[5],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [11] slice sampler: I[6],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [12] slice sampler: I[7],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [13] slice sampler: I[8],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [14] slice sampler: I[9],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [15] slice sampler: I[10],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [16] slice sampler: I[11],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [17] slice sampler: I[12],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
CBoutSpec$addMonitors(c("beta", "effpropS", "effpropI", "reporting"))
## thin = 1: reporting, effpropS, effpropI, beta
CBoutSpec$setThin(20)
## thin = 20: reporting, effpropS, effpropI, beta
CBoutSpec$removeSamplers(c("beta", "effpropS", "effpropI", "reporting"), print = TRUE)
## [1]  slice sampler: s0,  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [2]  slice sampler: I[1],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [3]  slice sampler: I[2],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [4]  slice sampler: I[3],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [5]  slice sampler: I[4],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [6]  slice sampler: I[5],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [7]  slice sampler: I[6],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [8]  slice sampler: I[7],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [9]  slice sampler: I[8],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [10] slice sampler: I[9],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [11] slice sampler: I[10],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [12] slice sampler: I[11],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
## [13] slice sampler: I[12],  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
CBoutSpec$addSampler("beta", type = "slice", print = TRUE)
## [14] slice sampler: beta,  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
CBoutSpec$addSampler("effpropS", type = "slice", print = TRUE)
## [15] slice sampler: effpropS,  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
CBoutSpec$addSampler("effpropI", type = "slice", print = TRUE)
## [16] slice sampler: effpropI,  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
CBoutSpec$addSampler("reporting", type = "slice", print = TRUE)
## [17] slice sampler: reporting,  adaptive: TRUE,  adaptInterval: 200,  sliceWidth: 1,  sliceMaxSteps: 100
CBoutMCMC <- buildMCMC(CBoutSpec)
CBoutMCMC <- compileNimble(CBoutMCMC, project  = CBout, resetFunctions = TRUE)
CBoutMCMC$run(30000)
## NULL
samplesNew <- as.matrix(CBoutMCMC$mvSamples)
  par(mfrow = c(2,2))
  acf(samplesNew[, "effpropS"])
  acf(samplesNew[, "effpropI"])
  plot(samplesNew[ , 'effpropS'], type = 'l', xlab = 'iteration')
  plot(samplesNew[ , 'effpropI'], type = 'l', xlab = 'iteration')

  par(mfrow = c(1,1))
  plot(samplesNew[ , 'effpropS'], samplesNew[ , 'effpropI'], pch = 20)
  points(def_effS, def_effI, pch = 20, col = "blue")

1.3 Compare the JAGS and NIMBLE results

We can also compare the NIMBLE and JAGS models side by side using MCMCsuite()

Be warned: running this code will produce about 6-8 graphs which will all pop up in separate windows!

nimcb <- MCMCsuite(code = nimcode,
                   data = nimCBdata,
                   inits = nimCBinits,
                   constants = nimCBcon,
                   MCMCs = c("jags", "nimble"),
                   monitors = c("beta", "reporting", "effpropS", "effpropI"),
                   niter = 12000,
                   calculateEfficiency = TRUE,
                   makePlot = FALSE,
                   savePlot = FALSE)
## defining model...
## building model...
## setting data and initial values...
## checking model...   (use nimbleModel(..., check = FALSE) to skip model check)
## model building finished
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
##    Graph Size: 83
## 
## Initializing model
  nimcb$summary
## , , beta
## 
##              mean     median          sd   CI95_low   CI95_upp     n
## jags   0.02211968 0.02044865 0.007749308 0.01228891 0.04245738 10000
## nimble 0.02238331 0.02026432 0.008585741 0.01212697 0.04593279 10000
##             ess efficiency
## jags   195.3439   62.25109
## nimble  83.1905   79.07843
## 
## , , reporting
## 
##             mean    median        sd  CI95_low  CI95_upp     n      ess
## jags   0.5178239 0.4943196 0.1396390 0.3072968 0.8401621 10000 269.0024
## nimble 0.5291161 0.5031207 0.1402128 0.3107321 0.8428592 10000 223.0523
##        efficiency
## jags     85.72416
## nimble  212.02693
## 
## , , effpropS
## 
##             mean    median        sd  CI95_low  CI95_upp     n       ess
## jags   0.7974094 0.8421026 0.1448468 0.4557955 0.9703215 10000 149.14588
## nimble 0.7917268 0.8326756 0.1502838 0.4371907 0.9729178 10000  52.59819
##        efficiency
## jags     47.52896
## nimble   49.99827
## 
## , , effpropI
## 
##             mean    median        sd   CI95_low  CI95_upp     n      ess
## jags   0.3706128 0.2992648 0.2725763 0.03221043 0.9491894 10000 234.6599
## nimble 0.3646788 0.2788137 0.2803558 0.02503840 0.9535620 10000 118.3843
##        efficiency
## jags     74.78009
## nimble  112.53258
  nimcb$efficiency
## $min
##     jags   nimble 
## 47.52896 49.99827 
## 
## $mean
##      jags    nimble 
##  67.57107 113.40905
  nimcb$timing
##           jags         nimble nimble_compile 
##          3.138          1.052         13.866
2.1 “hybrid approach”

We must rewrite the model so that there are no discrete latent variables. We call this the “hybrid model”.
An aside – Discrete Latent Variables:
An additional aside – Hamiltonian MCMC:
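To give the flavor of the hybrid idea (a hypothetical illustration, not the actual contents of hybrid.bug or hybrid.stan): each discrete binomial transition can be replaced by a continuous distribution with the same first two moments, e.g. a gamma, which is also why obs must be kept strictly positive with zerohack below.

```r
## moment-match a gamma to a binomial draw (illustrative numbers only;
## n and p play the roles of S[t-1] and pSI[t-1])
n <- 47; p <- 0.2
m <- n * p                 # binomial mean
v <- n * p * (1 - p)       # binomial variance
shape <- m^2 / v           # gamma with the same mean and variance
rate  <- m / v
c(mean = shape / rate, variance = shape / rate^2)   # recovers (m, v)
```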

But before we fit the model in STAN, let's explore the hybrid model in NIMBLE

NIMBLE allows us to compare the results of multiple models even if they have different parameterizations (e.g. Chain Binomial and the Hybrid Model)

data$obs <- data$obs + zerohack # Guarantee that obs remains above 0 (important for the gamma)
data$zerohack <- zerohack

hybridjags <- jags(data = data,
               inits = inits,
               param = params,
               model.file = "hybrid.bug",
               n.iter = 8000,
               n.chains = length(inits))
2.1.1 Hybrid in JAGS and NIMBLE
source('nimhybrid.R')
nimhydata <- list(obs = sim$Iobs + zerohack)
nimhycon <- list(numobs = numobs, pop = pop, r0 = r0, zerohack = zerohack)

nimhyinits <- list(I = sim$I + zerohack,
                   effpropS = effpropS,
                   effpropI = effpropI,
                   beta = beta,
                   reporting = reporting,
                   s0 = s0)
nimcb <- MCMCsuite(code = nimcode,
                   data = nimhydata,
                   inits = nimhyinits,
                   constants = nimhycon,
                   MCMCs = c("jags", "nimble"),
                   monitors = c("beta", "reporting", "effpropS", "effpropI"),
                   niter = 10000,
                   makePlot = FALSE,
                   savePlot = FALSE)
2.1.2 Hybrid in JAGS, NIMBLE and STAN

Run the STAN model

stantime <- system.time(s1 <- stan(file = 'hybrid.stan', data = data, init = inits,
           pars=c("beta", "reporting", "effpropS", "effpropI", "I"), iter = 8000,
           seed = 1001, chains = length(inits)))

Compare all three methods using the hybrid model

nimhydata <- list(obs = sim$Iobs + zerohack)
nimhycon <- list(numobs = numobs, pop = pop, 
                 r0 = r0, zerohack = zerohack)

nimhyinits <- list(I = sim$I + zerohack,
                   effpropS = effpropS,
                   effpropI = effpropI,
                   beta = beta,
                   reporting = reporting,
                   s0 = s0)
allhybrids <- MCMCsuite(code = nimcode,
                   data = nimhydata,
                   inits = nimhyinits,
                   constants = nimhycon,
                   stan_model = "hybrid.stan",
                   MCMCs = c("jags", "nimble", "stan"),
                   monitors = c("beta", "reporting", "effpropS", "effpropI"),
                   niter = 10000,
                   makePlot = FALSE,
                   savePlot = FALSE)
2.2 Finally, compare the Chain Binomial NIMBLE and Hybrid STAN
nimCBdata <- list(obs = sim$Iobs)
nimCBcon <- list(numobs = numobs, pop = pop, r0 = r0, zerohack = zerohack)

nimCBinits <- list(I = sim$I,
                   effpropS = effpropS,
                   effpropI = effpropI,
                   beta = beta,
                   reporting = reporting,
                   s0 = s0)
nimcb <- MCMCsuite(code = nimcode,
                   data = nimCBdata,
                   inits = nimCBinits,
                   constants = nimCBcon,
                   stan_model = "hybrid.stan",
                   MCMCs = c("jags", "nimble", "stan"),
                   monitors = c("beta", "reporting", "effpropS", "effpropI"),
                   niter = 10000,
                   makePlot = TRUE,
                   savePlot = TRUE)

Part 3

3.1.1 See p. 87 of the NIMBLE User Manual for the custom MCMC sampler example
## the name of this sampler function, for the purposes of
## adding it to MCMC configurations, will be 'my_RW'
my_RW <- nimbleFunction(

    ## sampler functions must contain 'sampler_BASE'
    contains = sampler_BASE,

    ## sampler functions must have exactly these setup arguments:
    ## model, mvSaved, target, control
    setup = function(model, mvSaved, target, control) {
        ## first, extract the control list elements, which will
        ## dictate the behavior of this sampler.
        ## the setup code will be later processed to determine
        ## all named elements extracted from the control list.
        ## these will become the required elements for any
        ## control list argument to this sampler, unless they also
        ## exist in the NIMBLE system option 'MCMCcontrolDefaultList'.
        ## the random walk proposal standard deviation
        scale <- control$scale
        ## determine the list of all dependent nodes,
        ## up to the first layer of stochastic nodes, generally
        ## called 'calcNodes'.  The values, inputs, and logProbs
        ## of these nodes will be retrieved and/or altered
        ## by this algorithm.
        calcNodes <- model$getDependencies(target)
    },

    ## the run function must accept no arguments, execute
    ## the sampling algorithm, leave the modelValues object
    ## 'mvSaved' as an exact copy of the updated values in model,
    ## and have no return value. initially, mvSaved contains
    ## an exact copy of the values and logProbs in the model.
    run = function() {
        ## extract the initial model logProb
        model_lp_initial <- getLogProb(model, calcNodes)
        ## generate a proposal value for target node
        proposal <- rnorm(1, model[[target]], scale)
        ## store this proposed value into the target node.
        ## notice the double assignment operator, `<<-`,
        ## necessary because 'model' is a persistent member
        ## data object of this sampler.
        model[[target]] <<- proposal
        ## calculate target_logProb, propagate the
        ## proposed value through any deterministic dependents,
        ## and calculate the logProb for any stochastic
        ## dependents.  The total (sum) logProb is returned.
        model_lp_proposed <- calculate(model, calcNodes)
        ## calculate the log Metropolis-Hastings ratio
        log_MH_ratio <- model_lp_proposed - model_lp_initial
        ## Metropolis-Hastings step: determine whether or
        ## not to accept the newly proposed value
        u <- runif(1, 0, 1)
        if (u < exp(log_MH_ratio)) jump <- TRUE
        else jump <- FALSE
        ## if we accepted the proposal, then store the updated
        ## values and logProbs from 'model' into 'mvSaved'.
        ## if the proposal was not accepted, restore the values
        ## and logProbs from 'mvSaved' back into 'model'.
        if (jump) copy(from = model, to = mvSaved, row = 1,
                       nodes = calcNodes, logProb = TRUE)
        else      copy(from = mvSaved, to = model, row = 1,
                       nodes = calcNodes, logProb = TRUE)
    },

    ## sampler functions must have a member method 'reset',
    ## which takes no arguments and has no return value.
    ## this function is used to reset the sampler to its
    ## initial state.  since this sampler function maintains
    ## no internal member data variables, reset() needn't
    ## do anything.
    methods = list(
        reset = function() {}
    )
)
## now, assume the existence of an R model object 'Rmodel',
## which has a scalar-valued stochastic node 'x'
## create an MCMC configuration with no sampler functions
mcmcspec <- configureMCMC(Rmodel, nodes = NULL)
## add our custom-built random walk sampler on node 'x',
## with a fixed proposal standard deviation = 0.1
mcmcspec$addSampler(target = 'x', type = 'my_RW',
                    control = list(scale = 0.1))
Rmcmc <- buildMCMC(mcmcspec)  ## etc...
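To actually run the custom sampler, the model and the MCMC both need to be compiled. A sketch continuing the manual's example (the iteration count is illustrative; 'Rmodel' and 'Rmcmc' are the objects built above):

```r
## Compile the model and the MCMC, run the chain, and extract
## the samples as a matrix (one column per monitored node).
Cmodel <- compileNimble(Rmodel)
Cmcmc  <- compileNimble(Rmcmc, project = Rmodel)
Cmcmc$run(10000)
samples <- as.matrix(Cmcmc$mvSamples)
```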

Part 4

4.1.1 Monte Carlo Expectation Maximization

Suppose we have a model with missing data (or a layer of latent variables that can be treated as missing data) and we would like to maximize the marginal likelihood of the model, integrating over the missing data. A brute-force method for doing this is MCEM.

Start by building the model

hybemout <- nimbleModel(code = nimcode, 
                         name = 'hybemout', 
                         constants = nimhycon,
                         data = nimhydata, 
                         inits = nimhyinits)

hybMCEM <- buildMCEM(model = hybemout, latentNodes = c("I", "s0"),
                     burnIn = 100, mcmcControl = list(adaptInterval = 20),
                     boxConstraints = list(list(c("beta", "reporting", "effpropS", "effpropI"),
                                                limits = c(0, Inf))),
                     buffer = 1e-6)

# The MCEM algorithm allows for box constraints on the nodes that will be optimized,
# specified via the boxConstraints argument. This is highly recommended for nodes that
# have zero density on parts of the real line.
hybMCEM(maxit = 20, m1 = 150, m2 = 200)

4.1.2 Particle Filter

Set up the NIMBLE model

CBpfout <- nimbleModel(code = nimcode, 
                         name = 'CBpfout', 
                         constants = nimCBcon,
                         data = nimCBdata,
                         check = FALSE)

Build the particle filter

CBpfoutC <- compileNimble(CBpfout)

CBpf <- buildPF(CBpfout, c("I"))
CBpfC <- compileNimble(CBpf, project = CBpfout)

Set your parameters

CBpfoutC$beta = 0.02
CBpfoutC$effpropS = 0.8
CBpfoutC$effpropI = 0.2
CBpfoutC$reporting = 0.5

Obtain log-likelihood

loglik <- CBpfC$run(m = 5000)

A single log-likelihood estimate is of limited use on its own; the value of the particle filter comes from embedding it in a larger algorithm.

Use this framework to construct your own updater, e.g. a particle marginal Metropolis-Hastings sampler that uses the filter's log-likelihood estimate in its acceptance ratio.
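One such updater is a particle marginal Metropolis-Hastings step: propose a new parameter value, re-run the filter to get a fresh log-likelihood estimate, and accept or reject. A hedged sketch for beta, using CBpfoutC and CBpfC from above (the proposal standard deviation is illustrative, and a flat prior on beta is assumed; a real updater would add the log-prior difference to the ratio):

```r
## Particle-marginal MH sketch for beta (flat prior assumed).
beta_cur <- CBpfoutC$beta
ll_cur   <- CBpfC$run(m = 5000)
for (it in 1:100) {
    beta_prop <- rnorm(1, beta_cur, 0.005)   ## illustrative proposal sd
    ## reject negative proposals outright, since beta must be positive
    if (beta_prop > 0) {
        CBpfoutC$beta <- beta_prop
        ll_prop <- CBpfC$run(m = 5000)
        if (runif(1) < exp(ll_prop - ll_cur)) {   ## MH accept/reject
            beta_cur <- beta_prop
            ll_cur   <- ll_prop
        } else {
            CBpfoutC$beta <- beta_cur             ## restore current value
        }
    }
}
```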

Part 5

NIMBLE Notes

Truncation of distributions
• x ~ dnorm(0, sd = 10) T(0, a), or
• x ~ T(dnorm(0, sd = 10), 0, a)

Constraints
• mu1 ~ dnorm(0, 1)
• mu2 ~ dnorm(0, 1)
• constraint_data ~ dconstraint( mu1 + mu2 > 0 )
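For dconstraint to act as a constraint (rather than a free binary node), the constraint node must be supplied as data with value 1, which conditions the model on the event being true. A minimal sketch:

```r
library(nimble)

## Conditioning on mu1 + mu2 > 0: the constraint node is passed
## in as data equal to 1.
constraintModel <- nimbleModel(
    code = nimbleCode({
        mu1 ~ dnorm(0, 1)
        mu2 ~ dnorm(0, 1)
        constraint_data ~ dconstraint(mu1 + mu2 > 0)
    }),
    data = list(constraint_data = 1))
```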

Lifted Nodes
• The use of link functions causes new nodes to be introduced
• When distribution parameters are expressions, NIMBLE creates a new deterministic node that contains the expression for a given parameter

logProb
• For each variable that contains at least one stochastic node, NIMBLE generates a model variable with the prefix “logProb_”
• Can be retrieved with getLogProb
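A minimal sketch of the difference between calculating and retrieving, assuming a model object 'Rmodel' with a stochastic node 'x':

```r
## calculate() recomputes and stores the log probability of the
## given nodes (returning their sum); getLogProb() retrieves the
## stored value without recalculating.
calculate(Rmodel, "x")
getLogProb(Rmodel, "x")
```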

Choice of Samplers
1. If the node has no stochastic dependents, a predictive end sampler is assigned. The end sampling algorithm merely calls simulate on the particular node.
2. The node is checked for presence of a conjugate relationship between its prior distribution and the distributions of its stochastic dependents. If it is determined to be in a conjugate relationship, then the corresponding conjugate (Gibbs) sampler is assigned.
3. If the node is discrete-valued, then a slice sampler is assigned [5].
4. If the node follows a multivariate distribution, then a RW block sampler is assigned for all elements. This is a Metropolis-Hastings adaptive random-walk sampler with a multivariate normal proposal [6].
5. If none of the above criteria are satisfied, then a RW sampler is assigned. This is a Metropolis-Hastings adaptive random-walk sampler with a univariate normal proposal distribution.
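These default assignments can be inspected and overridden on the MCMC configuration object. A sketch assuming a model 'Rmodel' with a node 'x' (method names follow recent NIMBLE releases and may differ in older versions):

```r
conf <- configureMCMC(Rmodel)
conf$printSamplers()          ## show which sampler each node was assigned
conf$removeSamplers("x")      ## drop the default sampler on x
conf$addSampler(target = "x", type = "slice")  ## assign a slice sampler instead
```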

Missing Values
See pg 43 of the NimbleUserManual